Abstract

In 1991 Stute introduced a class of estimators called conditional U-statistics. They can be seen
as a generalization of the Nadaraya-Watson estimator for the regression function, and he proved
their strong pointwise consistency to

$$m(\mathbf{t}) := \mathbb{E}[g(Y_1,\dots,Y_m) \mid (X_1,\dots,X_m) = \mathbf{t}], \quad \mathbf{t} \in \mathbb{R}^m.$$

Very recently, Giné and Mason introduced the notion of a local U-process, which generalizes
that of a local empirical process, and obtained central limit theorems and laws of the iterated
logarithm for this class. We apply the methods developed in Einmahl and Mason (2005) and
Giné and Mason (2007a, b) to establish uniform in bandwidth consistency to $m(\mathbf{t})$ of the estimator
proposed by Stute.

Keywords: conditional U-statistics, empirical process, kernel estimation, Nadaraya-Watson, regression

1 Introduction and statement of main results 

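Let $(X, Y), (X_1, Y_1), \dots, (X_n, Y_n)$ be independent random vectors with common joint density
function $f : \mathbb{R} \times \mathbb{R} \to [0, \infty[$, and for a measurable function $\varphi : \mathbb{R}^m \to \mathbb{R}$, consider the regression
function

$$m_\varphi(\mathbf{t}) = \mathbb{E}[\varphi(Y_1,\dots,Y_m) \mid (X_1,\dots,X_m) = \mathbf{t}], \quad \mathbf{t} \in \mathbb{R}^m.$$

Stute [11] introduced a class of estimators for $m_\varphi(\mathbf{t})$, called conditional U-statistics and defined
pointwise for $\mathbf{t} \in \mathbb{R}^m$ as

$$\widehat{m}_n(\mathbf{t}; h_n) = \frac{\sum_{(i_1,\dots,i_m) \in I_n^m} \varphi(Y_{i_1},\dots,Y_{i_m})\, K\!\left(\frac{t_1 - X_{i_1}}{h_n}\right) \cdots K\!\left(\frac{t_m - X_{i_m}}{h_n}\right)}{\sum_{(i_1,\dots,i_m) \in I_n^m} K\!\left(\frac{t_1 - X_{i_1}}{h_n}\right) \cdots K\!\left(\frac{t_m - X_{i_m}}{h_n}\right)}, \qquad (1.1)$$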



where

$$I_n^m = \{(i_1,\dots,i_m) : 1 \le i_j \le n,\ i_j \ne i_l \text{ if } j \ne l\}, \qquad (1.2)$$

and $0 < h_n < 1$ goes to zero at a certain rate. Soon afterwards, Sen [10] obtained results on
uniform consistency of this estimator. We shall adapt and extend the methods developed in
Einmahl and Mason [5] and Giné and Mason [6, 7] to show that under appropriate regularity
conditions a much stronger form of consistency holds, namely uniform in bandwidth consistency
of $\widehat{m}_n$. This means that with probability 1,

$$\limsup_{n\to\infty}\ \sup_{a_n \le h \le b_n}\ \sup_{\mathbf{t} \in [c,d]^m} |\widehat{m}_n(\mathbf{t}; h) - m_\varphi(\mathbf{t})| = 0, \qquad (1.3)$$

for $-\infty < c < d < \infty$ and $a_n < b_n$, as long as $a_n \to 0$, $b_n \to 0$ and $n a_n^m/\log n \to \infty$ at rates depending
upon the moments of $\varphi(Y_1,\dots,Y_m)$. Moreover, we shall show that (1.3) holds uniformly in
$\varphi \in \mathcal{F}$ as well. In fact, our results extend those of Einmahl and Mason [5], who treat the case
$m = 1$.
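To make the definition (1.1) concrete, the following is a minimal brute-force sketch of the estimator, not an implementation taken from the paper. The function names `uniform_kernel` and `conditional_u_statistic` are hypothetical, the uniform kernel on $[-1/2, 1/2]$ is an assumed choice of $K$, and the loop over all of $I_n^m$ costs on the order of $n^m$ operations, so it is only meant for small samples.

```python
import itertools
import numpy as np

def uniform_kernel(u):
    """Assumed kernel: indicator of [-1/2, 1/2], which integrates to 1."""
    return np.where(np.abs(u) <= 0.5, 1.0, 0.0)

def conditional_u_statistic(x, y, phi, t, h, kernel=uniform_kernel):
    """Evaluate the ratio in (1.1) at the point t = (t_1, ..., t_m)."""
    m, n = len(t), len(x)
    num = den = 0.0
    # I_n^m: all m-tuples of pairwise distinct indices, as in (1.2).
    for idx in itertools.permutations(range(n), m):
        w = 1.0
        for j, i in enumerate(idx):
            w *= float(kernel((t[j] - x[i]) / h))
        num += phi(*(y[i] for i in idx)) * w
        den += w
    # The ratio is undefined when no tuple receives positive weight.
    return num / den if den > 0 else float("nan")
```

For $m = 1$ and `phi = lambda a: a` this reduces to the Nadaraya-Watson estimator, which is the case treated by Einmahl and Mason [5].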



We shall infer (1.3) via general uniform in bandwidth results for a specific U-statistic
process indexed by a class of functions. We define this process in (1.4) below. Towards this end,
for $m \le n$, consider a class $\mathcal{F}$ of measurable functions $g : \mathbb{R}^m \to \mathbb{R}$ such that $\mathbb{E} g^2(Y_1,\dots,Y_m) < \infty$,
which satisfies the following conditions (F.i)-(F.iii). First, to avoid measurability problems,
we assume that

(F.i) $\mathcal{F}$ is a pointwise measurable class,

i.e. there exists a countable subclass $\mathcal{F}_0$ of $\mathcal{F}$ such that we can find for any function $g \in \mathcal{F}$ a
sequence of functions $g_m \in \mathcal{F}_0$ for which $g_m(\mathbf{z}) \to g(\mathbf{z})$, $\mathbf{z} \in \mathbb{R}^m$. This condition is discussed in
van der Vaart and Wellner [12]. We also assume that $\mathcal{F}$ has a measurable envelope function

(F.ii) $F(\mathbf{y}) \ge \sup_{g \in \mathcal{F}} |g(\mathbf{y})|, \quad \mathbf{y} \in \mathbb{R}^m.$

Finally we assume that $\mathcal{F}$ is of VC-type with characteristics $A$ and $v$ ("VC" for Vapnik and
Červonenkis), meaning that for some $A \ge 3$ and $v \ge 1$,

(F.iii) $\mathcal{N}(\mathcal{F}, L_2(Q), \varepsilon) \le \left(\dfrac{A \|F\|_{L_2(Q)}}{\varepsilon}\right)^{v}, \quad 0 < \varepsilon \le 2\|F\|_{L_2(Q)},$

where for $\varepsilon > 0$, $\mathcal{N}(\mathcal{F}, L_2(Q), \varepsilon)$ is defined as the smallest number of $L_2(Q)$ open balls of radius
$\varepsilon$ required to cover $\mathcal{F}$, and $Q$ is any probability measure on $(\mathbb{R}^m, \mathcal{B})$ such that $\|F\|_{L_2(Q)} < \infty$.
(If (F.iii) holds for $\mathcal{F}$, then we say that the VC-type class $\mathcal{F}$ admits the characteristics $A$ and $v$.)



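Let now $K : \mathbb{R} \to \mathbb{R}$ be a kernel function with support contained in $[-1/2, 1/2]$ satisfying

(K.i) $\sup_{x \in \mathbb{R}} |K(x)| =: \kappa < \infty$ and $\int K(x)\,dx = 1.$

For such kernels, we consider the class of functions $\mathcal{K} := \{hK_h(t - \cdot) : h > 0,\ t \in \mathbb{R}\}$ and assume
that

(K.ii) $\mathcal{K}$ is pointwise measurable and of VC-type,

where as usual $K_h(z) = h^{-1}K(z/h)$, $z \in \mathbb{R}$. Furthermore, let

(K.iii) $\widetilde{K}(\mathbf{t}) := \prod_{j=1}^{m} K(t_j), \quad \mathbf{t} = (t_1,\dots,t_m),$

denote the product kernel, with $\widetilde{K}_h(\mathbf{z}) = h^{-m}\widetilde{K}(\mathbf{z}/h)$, $\mathbf{z} \in \mathbb{R}^m$. Next, if $(S, \mathcal{S})$ is a measurable space, define the general U-statistic
with kernel $H : S^k \to \mathbb{R}$ based on $S$-valued random variables $Z_1, \dots, Z_n$ as

$$U_n^{(k)}(H) := \frac{(n-k)!}{n!} \sum_{\mathbf{i} \in I_n^k} H(Z_{i_1},\dots,Z_{i_k}), \quad 1 \le k \le n,$$

where $I_n^k$ is defined as in (1.2) with $m = k$. (Note that we do not require $H$ to be symmetric
here.) For a bandwidth $0 < h < 1$ and $g \in \mathcal{F}$, consider the U-kernel

$$G_{g,h,\mathbf{t}}(\mathbf{x},\mathbf{y}) := g(\mathbf{y})\widetilde{K}_h(\mathbf{t} - \mathbf{x}), \quad \mathbf{x},\mathbf{y},\mathbf{t} \in \mathbb{R}^m,$$

and for the sample $(X_1, Y_1), \dots, (X_n, Y_n)$, define

$$U_n(g,h,\mathbf{t}) := U_n^{(m)}(G_{g,h,\mathbf{t}}) = \frac{(n-m)!}{n!} \sum_{\mathbf{i} \in I_n^m} G_{g,h,\mathbf{t}}(\mathbf{X}_\mathbf{i}, \mathbf{Y}_\mathbf{i}),$$

where throughout this paper we shall use the notation

$\mathbf{X} = (X_1,\dots,X_m) \in \mathbb{R}^m$ and $\mathbf{X}_\mathbf{i} := (X_{i_1},\dots,X_{i_m}) \in \mathbb{R}^m$, $\mathbf{i} \in I_n^m$,
$\mathbf{Y} = (Y_1,\dots,Y_m) \in \mathbb{R}^m$ and $\mathbf{Y}_\mathbf{i} := (Y_{i_1},\dots,Y_{i_m}) \in \mathbb{R}^m$, $\mathbf{i} \in I_n^m$.

Now introduce the U-statistic process

$$u_n(g,h,\mathbf{t}) := \sqrt{n}\,\{U_n(g,h,\mathbf{t}) - \mathbb{E}U_n(g,h,\mathbf{t})\}. \qquad (1.4)$$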

We shall establish a strong uniform in bandwidth consistency result for the U-statistic
process in (1.4). Theorem 1 gives such a result for bounded classes of functions $\mathcal{F}$, while
Theorem 2 is applicable for unbounded classes $\mathcal{F}$ which satisfy a conditional moment condition
stated in (1.6) below. In the bounded case, we assume that the envelope function of $\mathcal{F}$ is
bounded by some finite constant $M$, i.e., (1.5) holds.
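Theorem 1 Suppose that the marginal density $f_X$ of $X$ is bounded, and let $a_n = c(\log n/n)^{1/m}$
for $c > 0$. If the class of functions $\mathcal{F}$ is bounded in the sense that for some $0 < M < \infty$,

$$F(\mathbf{y}) \le M, \quad \mathbf{y} \in \mathbb{R}^m, \qquad (1.5)$$

we can infer under the above mentioned assumptions on $\mathcal{F}$ and $\mathcal{K}$, that for all $c > 0$ and
$0 < b_0 < 1$ there exists a constant $0 < C < \infty$ such that

$$\limsup_{n\to\infty}\ \sup_{a_n \le h \le b_0}\ \sup_{g \in \mathcal{F}}\ \sup_{\mathbf{t} \in \mathbb{R}^m} \frac{\sqrt{nh^m}\,|U_n(g,h,\mathbf{t}) - \mathbb{E}U_n(g,h,\mathbf{t})|}{\sqrt{|\log h| \vee \log\log n}} \le C, \quad \text{a.s.}$$

Theorem 2 Suppose that the marginal density $f_X$ of $X$ is bounded, and for $c > 0$ let
$a_n' = c((\log n/n)^{1-2/p})^{1/m}$. If $\mathcal{F}$ is unbounded but satisfies for some $p > 2$

$$\mu_p := \sup_{\mathbf{x} \in \mathbb{R}^m} \mathbb{E}[F^p(\mathbf{Y}) \mid \mathbf{X} = \mathbf{x}] < \infty, \qquad (1.6)$$

we can infer under the above mentioned assumptions on $\mathcal{F}$ and $\mathcal{K}$, that for all $c > 0$ and
$0 < b_0 < 1$ there exists a constant $0 < C' < \infty$ such that

$$\limsup_{n\to\infty}\ \sup_{a_n' \le h \le b_0}\ \sup_{g \in \mathcal{F}}\ \sup_{\mathbf{t} \in \mathbb{R}^m} \frac{\sqrt{nh^m}\,|U_n(g,h,\mathbf{t}) - \mathbb{E}U_n(g,h,\mathbf{t})|}{\sqrt{|\log h| \vee \log\log n}} \le C', \quad \text{a.s.}$$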
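The lower bandwidth limits in Theorems 1 and 2 and the normalisation appearing in both statements are easy to tabulate. The sketch below is only an illustration of those formulas; the helper names `a_n`, `a_n_prime` and `rate` are hypothetical, and `rate` returns the order $\sqrt{(|\log h| \vee \log\log n)/(nh^m)}$ without the unspecified constants $C$, $C'$.

```python
import math

def a_n(n, m, c=1.0):
    """Lower bandwidth limit of Theorem 1: c * (log n / n)^(1/m)."""
    return c * (math.log(n) / n) ** (1.0 / m)

def a_n_prime(n, m, p, c=1.0):
    """Lower bandwidth limit of Theorem 2: c * ((log n / n)^(1 - 2/p))^(1/m)."""
    return c * ((math.log(n) / n) ** (1.0 - 2.0 / p)) ** (1.0 / m)

def rate(n, m, h):
    """Order of |U_n - E U_n| suggested by the theorems, constants omitted."""
    return math.sqrt(max(abs(math.log(h)), math.log(math.log(n))) / (n * h ** m))
```

Since $\log n/n < 1$ and $1 - 2/p < 1$, the unbounded case always requires the larger lower limit, $a_n' > a_n$, and $a_n'$ approaches $a_n$ as $p \to \infty$.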

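From now on, we shall write $\widehat{m}_{n,\varphi}(\mathbf{t}, h)$ for the estimator of the regression function defined
in (1.1) to stress the role of $\varphi(\mathbf{y})$. It is clear that $\widehat{m}_{n,\varphi}(\mathbf{t}, h)$ can be rewritten for all $\varphi \in \mathcal{F}$ as

$$\widehat{m}_{n,\varphi}(\mathbf{t}, h) = \frac{U_n(\varphi,h,\mathbf{t})}{U_n(1,h,\mathbf{t})},$$

where we denote by $U_n(1,h,\mathbf{t})$ the U-statistic $U_n(g,h,\mathbf{t})$ with $g \equiv 1$. To prove the uniform
consistency of $\widehat{m}_{n,\varphi}(\mathbf{t},h)$ to $m_\varphi(\mathbf{t})$, we shall consider another, but more appropriate, centering
factor than the expectation $\mathbb{E}\widehat{m}_{n,\varphi}(\mathbf{t}, h)$, which may not exist or be difficult to compute. Define
the centering

$$\widehat{\mathbb{E}}\widehat{m}_{n,\varphi}(\mathbf{t},h) := \frac{\mathbb{E}U_n(\varphi,h,\mathbf{t})}{\mathbb{E}U_n(1,h,\mathbf{t})}. \qquad (1.7)$$

This centering permits us to apply Theorems 1 and 2 (depending on whether the class $\mathcal{F}$ is
bounded in the sense of (1.5) or unbounded in the sense of (1.6)) to derive results on the
convergence rates of the process $\widehat{m}_{n,\varphi}(\mathbf{t}, h) - \widehat{\mathbb{E}}\widehat{m}_{n,\varphi}(\mathbf{t}, h)$ to zero and the consistency of $\widehat{m}_{n,\varphi}(\mathbf{t}, h)$,
uniformly in bandwidth.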
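The ratio representation can be checked numerically. The following sketch, with hypothetical names `box_kernel`, `u_stat` and `m_hat`, computes $U_n(g,h,\mathbf{t})$ by brute force with an assumed box kernel and forms the quotient $U_n(\varphi,h,\mathbf{t})/U_n(1,h,\mathbf{t})$; the factors $(n-m)!/n!$ and $h^{-m}$ cancel in the quotient, which is why the estimator itself does not depend on them.

```python
import itertools
import math

def box_kernel(u):
    """Assumed kernel K: indicator of [-1/2, 1/2]."""
    return 1.0 if abs(u) <= 0.5 else 0.0

def u_stat(x, y, g, t, h, kernel=box_kernel):
    """U_n(g, h, t) = (n-m)!/n! * sum over I_n^m of g(Y_i) * K~_h(t - X_i)."""
    m, n = len(t), len(x)
    total = 0.0
    for idx in itertools.permutations(range(n), m):
        # Product kernel K~_h(t - x) = h^(-m) * prod_j K((t_j - x_j) / h).
        w = math.prod(kernel((t[j] - x[i]) / h) for j, i in enumerate(idx))
        total += g(*(y[i] for i in idx)) * w / h ** m
    return total * math.factorial(n - m) / math.factorial(n)

def m_hat(x, y, phi, t, h):
    """Ratio form of the estimator: U_n(phi, h, t) / U_n(1, h, t)."""
    one = lambda *args: 1.0
    return u_stat(x, y, phi, t, h) / u_stat(x, y, one, t, h)
```

The denominator `u_stat(x, y, one, t, h)` is a kernel-type estimate of the product density $f_X(t_1)\cdots f_X(t_m)$, so the ratio is only meaningful where it is positive.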
For any compact interval $I = [c, d]$ with $-\infty < c < d < \infty$ and $\eta > 0$, define $I^\eta =
[c - \eta, d + \eta]$ and denote as usual the marginal density function of $X$ by $f_X$. Then introduce
the class of functions defined on the compact subset $J^m = I^\eta \times \dots \times I^\eta$ of $\mathbb{R}^m$,

$$\mathcal{M} = \{m_\varphi(\cdot)\widetilde{f}(\cdot) : \varphi \in \mathcal{F}\}, \qquad (1.8)$$

where the function $\widetilde{f} : \mathbb{R}^m \to \mathbb{R}$ is defined as

$$\widetilde{f}(\mathbf{t}) := \int f(t_1,y_1)\cdots f(t_m,y_m)\,dy_1 \dots dy_m = f_X(t_1)\cdots f_X(t_m). \qquad (1.9)$$

We have now introduced all the notation that we need to state our results on the uniform
consistency of the conditional U-statistic estimator proposed by Stute for the general regression
function, where this consistency is uniform in $\varphi \in \mathcal{F}$ and in bandwidth as well.

Theorem 3 Besides being bounded, suppose that the marginal density function $f_X$ of $X$ is
continuous and strictly positive on the interval $J = I^\eta$, where $I = [c, d]$ is a compact interval
and $\eta > 0$. Assume that the class of functions $\mathcal{M}$ is uniformly equicontinuous. Then it follows
that for all sequences $0 < b_n < 1$ with $b_n \to 0$,

$$\sup_{0 < h < b_n}\ \sup_{\varphi \in \mathcal{F}}\ \sup_{\mathbf{t} \in I^m} |\widehat{\mathbb{E}}\widehat{m}_{n,\varphi}(\mathbf{t}, h) - m_\varphi(\mathbf{t})| = o(1),$$

where $I^m = I \times \dots \times I$.

Theorem 4 Besides being bounded, suppose that the marginal density function $f_X$ of $X$ is
continuous and strictly positive on the interval $J = I^\eta$, where $I = [c, d]$ is a compact interval
and $\eta > 0$. Then it follows under the above mentioned assumptions on $\mathcal{F}$ and $\mathcal{K}$, that for all
$c > 0$ and all sequences $0 < b_n < 1$ with $a_n'' \le b_n \to 0$, there exists a constant $0 < C'' < \infty$ such
that

$$\limsup_{n\to\infty}\ \sup_{a_n'' \le h \le b_n}\ \sup_{\varphi \in \mathcal{F}}\ \sup_{\mathbf{t} \in I^m} \frac{\sqrt{nh^m}\,|\widehat{m}_{n,\varphi}(\mathbf{t}, h) - \widehat{\mathbb{E}}\widehat{m}_{n,\varphi}(\mathbf{t},h)|}{\sqrt{|\log h| \vee \log\log n}} \le C'', \quad \text{a.s.},$$

where $I^m = I \times \dots \times I$ and $a_n''$ is either $a_n$ or $a_n'$ depending on whether the class $\mathcal{F}$ is bounded
or not, i.e. whether (1.5) or (1.6) holds.

The following proposition follows straightforwardly from Theorems 3 and 4.

Proposition 1 Under the assumptions of Theorems 3 and 4 on $f_X$ and the classes $\mathcal{F}$ and $\mathcal{K}$,
it follows that for all sequences $0 < a_n < b_n < 1$ satisfying $b_n \to 0$ and $n a_n^m/\log n \to \infty$,

$$\sup_{a_n \le h \le b_n}\ \sup_{\varphi \in \mathcal{F}}\ \sup_{\mathbf{t} \in I^m} |\widehat{m}_{n,\varphi}(\mathbf{t},h) - m_\varphi(\mathbf{t})| \to 0, \quad \text{a.s.}, \qquad (1.10)$$

where $I^m = I \times \dots \times I$.

It is readily seen that one can take a„ = in the previous proposition and obtain strong 
uniform consistency of Stute's estimator (1.1) for general bandwidths. However, note that by
choosing $a_n = a_n''$, one would only obtain almost sure convergence to a positive constant $c > 0$
in (1.10).
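2 Preliminaries for the proofs of the theorems

Let $\Psi$ be a real valued functional defined on a class of functions $\mathcal{G}$ and $g$ a real valued function
defined on $\mathbb{R}^d$, $d \ge 1$. Occasionally we shall use the notation

$$\|\Psi(G)\|_{\mathcal{G}} = \sup_{G \in \mathcal{G}} |\Psi(G)| \quad\text{and}\quad \|g\|_\infty = \sup_{\mathbf{x} \in \mathbb{R}^d} |g(\mathbf{x})|. \qquad (2.1)$$

In the sequel we shall need to symmetrize the functions $G_{g,h,\mathbf{t}}(\cdot,\cdot)$. To do this, we set

$$\bar{G}_{g,h,\mathbf{t}}(\mathbf{x},\mathbf{y}) := (m!)^{-1} \sum_{\sigma \in I_m^m} G_{g,h,\mathbf{t}}(\mathbf{x}_\sigma, \mathbf{y}_\sigma) = (m!)^{-1} \sum_{\sigma \in I_m^m} g(\mathbf{y}_\sigma)\widetilde{K}_h(\mathbf{t} - \mathbf{x}_\sigma),$$

where $\mathbf{z}_\sigma := (z_{\sigma_1}, \dots, z_{\sigma_m})$. Obviously, the expectation of $G_{g,h,\mathbf{t}}$ remains unchanged after
symmetrization, and $U_n^{(m)}(\bar{G}_{g,h,\mathbf{t}}(\cdot,\cdot)) = U_n(g,h,\mathbf{t})$, and thus the U-statistic process in (1.4) may
be redefined using the symmetrized kernels, i.e. we consider

$$u_n(g,h,\mathbf{t}) = \sqrt{n}\,\{U_n^{(m)}(\bar{G}_{g,h,\mathbf{t}}) - \mathbb{E}U_n^{(m)}(\bar{G}_{g,h,\mathbf{t}})\}. \qquad (2.2)$$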
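The claim that symmetrizing the kernel leaves the U-statistic unchanged can be verified on a toy sample. This is a small sketch with hypothetical helper names `u_statistic` and `symmetrize`; the kernel passed in is an arbitrary function of $m$ sample points and is not tied to the specific form $g(\mathbf{y})\widetilde{K}_h(\mathbf{t}-\mathbf{x})$.

```python
import itertools
import math

def u_statistic(z, kernel, m):
    """U_n^{(m)}(H): average of H over all m-tuples of distinct sample points."""
    n = len(z)
    total = sum(kernel(*(z[i] for i in idx))
                for idx in itertools.permutations(range(n), m))
    return total * math.factorial(n - m) / math.factorial(n)

def symmetrize(kernel, m):
    """Average a kernel of m arguments over all m! orderings of its arguments."""
    perms = list(itertools.permutations(range(m)))

    def sym(*args):
        return sum(kernel(*(args[s] for s in p)) for p in perms) / len(perms)

    return sym
```

Because the sum in `u_statistic` already runs over every ordering of each index set, replacing `kernel` by `symmetrize(kernel, m)` only regroups the same terms.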

Moreover, the Hoeffding decomposition tells us that

$$u_n(g,h,\mathbf{t}) = \sqrt{n} \sum_{k=1}^{m} \binom{m}{k} U_n^{(k)}(\pi_k \bar{G}_{g,h,\mathbf{t}}(\cdot,\cdot)), \qquad (2.3)$$

where the $k$-th Hoeffding projection for the (symmetric) function $L : S^m \times S^m \to \mathbb{R}$ is defined
for $\mathbf{x}_k = (x_1, \dots, x_k) \in S^k$ and $\mathbf{y}_k = (y_1, \dots, y_k) \in S^k$ as

$$\pi_k L(\mathbf{x}_k, \mathbf{y}_k) := (\delta_{(x_1,y_1)} - P) \times \dots \times (\delta_{(x_k,y_k)} - P) \times P^{m-k}(L),$$

where $P$ is any probability measure on $(S, \mathcal{S})$. Considering $(X_i, Y_i)$, $i \ge 1$, i.i.d.-$P$ and assuming $L$
is in $L_2(P^m)$, this is an orthogonal decomposition, and $\mathbb{E}[\pi_k L(\mathbf{X}_k, \mathbf{Y}_k) \mid (X_2, Y_2), \dots, (X_k, Y_k)] = 0$,
$k \ge 1$, where we denote $\mathbf{X}_k$ and $\mathbf{Y}_k$ for $(X_1, \dots, X_k)$ and $(Y_1, \dots, Y_k)$ respectively. Thus the
kernels $\pi_k L$ are canonical for $P$ (or completely degenerate, or completely centered). Also, $\pi_k$,
$k \ge 1$, are nested projections, i.e., $\pi_k \circ \pi_l = \pi_k$ if $k \le l$, and

$$\mathbb{E}[(\pi_k L)^2(\mathbf{X}_k, \mathbf{Y}_k)] \le \mathbb{E}[(L - \mathbb{E}L)^2(\mathbf{X}, \mathbf{Y})] \le \mathbb{E}L^2(\mathbf{X}, \mathbf{Y}). \qquad (2.4)$$

For more details consult de la Peña and Giné [2].
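The Hoeffding projections are easiest to see for $m = 2$ on a finite sample space, where every expectation is a finite sum. The sketch below is an illustration under that assumption only: the symmetric kernel is stored as a matrix `L` over a support of `d` points with probability vector `p`, and the function name `hoeffding_projections` is hypothetical.

```python
import numpy as np

def hoeffding_projections(L, p):
    """L: symmetric (d, d) array of kernel values on a finite support,
    p: probability vector of length d. Returns (E L, pi_1 L, pi_2 L)."""
    L = np.asarray(L, dtype=float)
    p = np.asarray(p, dtype=float)
    row = L @ p                      # E[L(z, Z)] for each support point z
    mean = p @ row                   # E L(Z_1, Z_2)
    pi1 = row - mean                 # (delta_z - P) x P applied to L
    # (delta_z1 - P) x (delta_z2 - P) applied to L
    pi2 = L - row[:, None] - row[None, :] + mean
    return mean, pi1, pi2
```

With these arrays, $L - \mathbb{E}L = \pi_1 L(z_1) + \pi_1 L(z_2) + \pi_2 L(z_1, z_2)$, the second-order term integrates to zero in each argument (it is canonical), and its second moment is bounded by that of $L - \mathbb{E}L$, as in (2.4).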

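Since we assume $\mathcal{F}$ to be of VC-type with envelope function $F$, and $\mathcal{K}$ to be of VC-type
with envelope $\kappa$, it is readily checked (via Lemma A.1 in Einmahl and Mason [1]) that the class
of functions on $\mathbb{R}^m \times \mathbb{R}^m$ given by $\{h^m G_{g,h,\mathbf{t}}(\cdot,\cdot) : g \in \mathcal{F},\ 0 < h < 1,\ \mathbf{t} \in \mathbb{R}^m\}$ is of VC-type, as
well as the class

$$\mathcal{G} = \{h^m \bar{G}_{g,h,\mathbf{t}}(\cdot,\cdot) : g \in \mathcal{F},\ 0 < h < 1,\ \mathbf{t} \in \mathbb{R}^m\}, \qquad (2.5)$$

for which we denote the VC-type characteristics by $A_1$ and $v_1$, and the envelope function by

$$\widetilde{F}(\mathbf{y}) \equiv \widetilde{F}(\mathbf{x},\mathbf{y}) = \kappa^m \sum_{\sigma \in I_m^m} F(\mathbf{y}_\sigma), \quad \mathbf{y} \in \mathbb{R}^m. \qquad (2.6)$$

(Recall (F.ii) and (F.iii) for terminology.) Next, for $k = 1, \dots, m$ introduce the classes of
functions on $\mathbb{R}^k \times \mathbb{R}^k$,

$$\mathcal{G}^{(k)} = \{h^m \pi_k \bar{G}_{g,h,\mathbf{t}}(\cdot,\cdot) : g \in \mathcal{F},\ 0 < h < 1,\ \mathbf{t} \in \mathbb{R}^m\}. \qquad (2.7)$$

Then an argument in Giné and Mason [7] shows that each class $\mathcal{G}^{(k)}$ is of VC-type with
characteristics $A_1$ and $v_1$ and envelope function $F_k \le 2^k\|\widetilde{F}\|_\infty$. (See the completion of the proof of
Theorem 1 in that paper for more details.)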

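3 Proof of Theorem 1: the bounded case

We begin with studying the first term of (2.3), namely

$$m\sqrt{n}\,U_n^{(1)}(\pi_1 \bar{G}_{g,h,\mathbf{t}}(\cdot,\cdot)) = \frac{m}{\sqrt{n}} \sum_{i=1}^{n} \pi_1 \bar{G}_{g,h,\mathbf{t}}(X_i, Y_i).$$

Linear term of (2.3)

From the definition of the Hoeffding projections and recalling that the sample $(X_1, Y_1), \dots, (X_n, Y_n)$
is i.i.d., we can say for all $(x, y) \in \mathbb{R}^2$ that

$$\pi_1 \bar{G}_{g,h,\mathbf{t}}(x, y) = \mathbb{E}[\bar{G}_{g,h,\mathbf{t}}((x, X_2, \dots, X_m), (y, Y_2, \dots, Y_m))] - \mathbb{E}\bar{G}_{g,h,\mathbf{t}}(\mathbf{X}, \mathbf{Y})$$
$$= \mathbb{E}[\bar{G}_{g,h,\mathbf{t}}(\mathbf{X},\mathbf{Y}) \mid (X_1,Y_1) = (x,y)] - \mathbb{E}\bar{G}_{g,h,\mathbf{t}}(\mathbf{X},\mathbf{Y}).$$

Introduce therefore the function on $\mathbb{R} \times \mathbb{R}$ (for clarity we do not indicate the dependence on $m$)

$$S_{g,h,\mathbf{t}} : \mathbb{R} \times \mathbb{R} \to \mathbb{R}, \qquad (x,y) \mapsto m h^m\,\mathbb{E}[\bar{G}_{g,h,\mathbf{t}}(\mathbf{X},\mathbf{Y}) \mid (X_1,Y_1) = (x,y)].$$

Then obviously these functions are symmetric. Using this notation we write

$$m h^m \pi_1 \bar{G}_{g,h,\mathbf{t}}(x,y) = S_{g,h,\mathbf{t}}(x,y) - \mathbb{E}S_{g,h,\mathbf{t}}(X_1,Y_1),$$

and hence for all $g \in \mathcal{F}$, $h \in [a_n, b_0]$ and $\mathbf{t} \in \mathbb{R}^m$, the linear term of the decomposition in (2.3)
times $h^m$ is given by

$$m h^m \sqrt{n}\,U_n^{(1)}(\pi_1 \bar{G}_{g,h,\mathbf{t}}) = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \{S_{g,h,\mathbf{t}}(X_i,Y_i) - \mathbb{E}S_{g,h,\mathbf{t}}(X_i,Y_i)\} =: \alpha_n(S_{g,h,\mathbf{t}}),$$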
where this last expression is an empirical process based on the sample $(X_1, Y_1), \dots, (X_n, Y_n)$
and indexed by the class of functions on $\mathbb{R} \times \mathbb{R}$,

$$\mathcal{S}_n = \{S_{g,h,\mathbf{t}}(\cdot,\cdot) : g \in \mathcal{F},\ a_n \le h \le b_0,\ \mathbf{t} \in \mathbb{R}^m\}.$$

Clearly $\mathcal{S}_n \subset m\mathcal{G}^{(1)}$, and the class $m\mathcal{G}^{(1)}$ has envelope function $mF_1$, where $F_1$ is the envelope
function of the class $\mathcal{G}^{(1)}$ defined in (2.7). From the above discussion, this class is of VC-type
with the same characteristics as $\mathcal{G}$, and therefore, after appropriate identifications of notation,
we can apply Theorem 2 of Dony, Einmahl and Mason [3] to conclude that

$$\limsup_{n\to\infty}\ \sup_{a_n \le h \le b_0}\ \sup_{g \in \mathcal{F}}\ \sup_{\mathbf{t} \in \mathbb{R}^m} \frac{m\sqrt{nh^m}\,|U_n^{(1)}(\pi_1 \bar{G}_{g,h,\mathbf{t}})|}{\sqrt{|\log h| \vee \log\log n}} \le C, \quad \text{a.s.} \qquad (3.1)$$

Alternatively, a straightforward modification of the proof of (14. 9p below with replaced by a„ 
and by M, gives (13.11) as well. 

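The other terms of (2.3)

Our aim now is to show that all the other terms of the Hoeffding decomposition are almost
surely bounded or more precisely that for each $k = 2, \dots, m$,

$$\sup_{a_n \le h \le b_0}\ \sup_{g \in \mathcal{F}}\ \sup_{\mathbf{t} \in \mathbb{R}^m} \frac{\sqrt{nh^m}\,|U_n^{(k)}(\pi_k \bar{G}_{g,h,\mathbf{t}})|}{\sqrt{|\log h| \vee \log\log n}} = O(1), \quad \text{a.s.} \qquad (3.2)$$

Since $n a_n^m = c^m \log n$, this will be accomplished if we can prove that for each $k = 2, \dots, m$,

$$\sup_{a_n \le h \le b_0}\ \sup_{g \in \mathcal{F}}\ \sup_{\mathbf{t} \in \mathbb{R}^m} \frac{\sqrt{nh^m}\,|U_n^{(k)}(\pi_k \bar{G}_{g,h,\mathbf{t}})|}{\sqrt{(|\log h| \vee \log\log n)^k}} = O\!\left(\frac{1}{\sqrt{a_n^m n^{k-1}}}\right), \quad \text{a.s.} \qquad (3.3)$$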

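To obtain uniform in bandwidth convergence rates, we shall need a blocking argument and a
decomposition of the interval $[a_n, b_0]$ into smaller intervals. To do this, set $n_\ell = 2^\ell$, $\ell \ge 0$, and
consider the intervals $\mathcal{H}_{\ell,j} := [h_{\ell,j-1}, h_{\ell,j}]$, where the boundaries are given by $h_{\ell,j}^m := 2^j a_{n_\ell}^m$.
Setting $L(\ell) = \max\{j : h_{\ell,j} \le 2b_0\}$, observe that

$$[a_{n_\ell}, b_0] \subset \bigcup_{j=1}^{L(\ell)} \mathcal{H}_{\ell,j} \quad\text{and}\quad L(\ell) \sim \log\!\left(\frac{(2b_0)^m}{a_{n_\ell}^m}\right) \Big/ \log 2, \qquad (3.4)$$

implying in particular that $L(\ell) \le 2\log n_\ell$. (This fact will be used repeatedly to finish some
important steps of the proofs.) Next, for $1 \le j \le L(\ell)$, consider the class of functions on
$\mathbb{R}^m \times \mathbb{R}^m$,

$$\mathcal{G}_{\ell,j} := \{h^m \bar{G}_{g,h,\mathbf{t}}(\cdot,\cdot) : g \in \mathcal{F},\ h \in \mathcal{H}_{\ell,j},\ \mathbf{t} \in \mathbb{R}^m\},$$

as well as the class on $\mathbb{R}^k \times \mathbb{R}^k$,

$$\mathcal{G}^{(k)}_{\ell,j} := \{h^m \pi_k \bar{G}_{g,h,\mathbf{t}}(\cdot,\cdot)/M_k : g \in \mathcal{F},\ h \in \mathcal{H}_{\ell,j},\ \mathbf{t} \in \mathbb{R}^m\},$$

where $M_k = 2^k\kappa^m M$. Clearly, each class $\mathcal{G}_{\ell,j}$ is of VC-type with the same characteristics and
envelope function as $\mathcal{G}$, and $\mathcal{G}^{(k)}_{\ell,j}$ is of VC-type with the same characteristics as $\mathcal{G}^{(k)}$ (and thus
as $\mathcal{G}$) with envelope function $M_k^{-1}F_k$, where $F_k$ is the envelope function of $\mathcal{G}^{(k)}$. Notice that
from (1.5),

$$M_k \ge \sup\{|h^m \pi_k \bar{G}_{g,h,\mathbf{t}}(\mathbf{x},\mathbf{y})| : g \in \mathcal{F},\ 0 < h < 1,\ \mathbf{t} \in \mathbb{R}^m\},$$

and hence each function in $\mathcal{G}^{(k)}_{\ell,j}$ is bounded by 1. Define now for $n_{\ell-1} < n \le n_\ell$, $\ell = 1, 2, \dots$,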



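$$\mathbb{U}_n(j,k,\ell) := \sup_{H \in \mathcal{G}^{(k)}_{\ell,j}} \Big| \frac{1}{n^{k/2}} \sum_{\mathbf{i} \in I_n^k} H(\mathbf{X}_\mathbf{i}, \mathbf{Y}_\mathbf{i}) \Big|. \qquad (3.5)$$

From Theorem 4 of Giné and Mason [7] (see Theorem A.1 in the Appendix), we get for $c = 1/2$,
$r = 2$ and all $x > 0$ that for any $\ell \ge 1$,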



P 



max U„{j,k,i) >x}< -P{W„,(j,A;,£) > x/2}^/' E[W4(j, fc, £)]i/2. (3.6) 



We shall apply an exponential inequality and a moment bound for U-statistics due to, respectively,
de la Peña and Giné [2], and Giné and Mason [7], on the class $\mathcal{G}^{(k)}_{\ell,j}$ to bound (3.6). In
order to use these results we must first derive some bounds. Firstly, it is readily checked that



Wn(j,fc,£)<nf ||f/W(7r,G)|| 



(3.7) 



for all $n_{\ell-1} < n \le n_\ell$. (Recall the notation (2.1).) Secondly, notice that in (K.i), $K$ is
assumed to be bounded by $\kappa$ and has support in $[-1/2, 1/2]$, such that by assumption (1.5)
and $M_k = 2^k\kappa^m M$, for $H \in \mathcal{G}^{(k)}_{\ell,j}$ we have by (2.4)

Eif2(x,Y) < M^-^/i^^EGj^^tlX.Y) 



g\Y)K'^ 



< h"^A-'\\fx\\Z- 



t -X 



For Dm = 4 this gives us that 



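$$\sup_{H \in \mathcal{G}^{(k)}_{\ell,j}} \mathbb{E}H^2(\mathbf{X},\mathbf{Y}) \le D_m h_{\ell,j}^m =: \sigma_{\ell,j}^2. \qquad (3.8)$$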
Since $\pi_k\pi_k L = \pi_k L$ for all $k \ge 1$, we can now apply Theorem A.4 to the class $\mathcal{G}^{(k)}_{\ell,j}$ with $\sigma_{\ell,j}^2$ as
in (3.8), and obtain easily that for some constant $A_k$,

mlij.kA) < ^nWlth'^kHW.,, < 2^A,/.- |logV,f . (3.9) 



To control the probability term in (3.6), we shall apply an exponential inequality to the same
class $\mathcal{G}^{(k)}_{\ell,j}$ (recall that each $H \in \mathcal{G}^{(k)}_{\ell,j}$ is bounded by 1). Setting

$$y^* = C_{1,k}(|\log h_{\ell,j}| \vee \log\log n_\ell)^{k/2} =: C_{1,k}\lambda_{j,k}(\ell), \qquad (3.10)$$

where $C_{1,k} < \infty$, Theorem A.6 gives us constants $C_{2,k}, C_{3,k}$ such that for $j = 1, \dots, L(\ell)$ and
any $p > 1$,

$$P\{\mathbb{U}_{n_\ell}(j,k,\ell) > p^{k/2}y^*\} \le C_{2,k}\exp\{-C_{3,k}\,p\,y^{*2/k}\} \le \exp\{-C_{4,k}\,p\log\log n_\ell\}. \qquad (3.11)$$

Then plugging the bounds (3.9) and (3.11) into (3.6), we get for some $C_{5,k} > 0$, any $p \ge 2$ and
$\ell$ large enough,



P I max Un {j, k, £) > 2p^''^y* \ < 

[n^_i<n<n^ J 



{\ogn,)-P^j2kAkh^A\ogh 



Ci,kVp''{\ loghej V loglogn^)^' 

< y^(logn^)-''^5-'=. (3.12) 

Finally, note also that

$$n^{k/2}h^m|U_n^{(k)}(\pi_k \bar{G}_{g,h,\mathbf{t}})| \le C_k M_k\,\mathbb{U}_n(j,k,\ell), \qquad (3.13)$$

for some $C_k > 0$. Therefore by (3.4), for each $k = 2, \dots, m$ and $\ell$ large enough,

^ V^\U^'\7rkGg,h,t)\ 
max Ank '■= max sup sup sup — = 

ne-i<n<ne ' '^f-i<"<'^f a„</i<feo seJ^ teR™ a/( | log /l| V log log n)'' 

V^\ui''\7rkGg,h,t)\ 

< max max sup sup sup — 

ni-i<n<nil<j<L{e) h^-H^,. gi^jr^^^,^ ^(^\logh\ Vloglog^^)^ 

^ CkMk Knij,k,i) 

< — ^^=^= max max — — , 

where \j,k{£) was defined as in (13.101) . Now recall that hij < 2bo < 2 for j = 1, . . . ,L{i) and 
that L{i) < 21ogn£. Then (I3.12p applied with p > (2 + S)/C^^k, S > and in combination 

with the above inequality and the obvious bound ^ya!^^nF~^ An,k < \/ '3'™^^£~^^n,fc valid for all 
$n_{\ell-1} < n \le n_\ell$, implies for $C_{6,k} \ge 2p^{k/2}C_kM_kC_{1,k}$ that for $k = 2, \dots, m$



P<! max ^Ja^n^-^An,k > C^,k \ < V ./^(logn^)-''^^.'^ 



< L(£)V2™(logn^) 

< V2'"+2(£log2)"(^+^). 

This proves via some elementary bounds and Borel-Cantelli that (3.3) holds, which obviously
implies (3.2), and hence completes the proof of Theorem 1.



4 Proof of Theorem 2: the unbounded case

In case (1.5) is not satisfied, we consider bandwidths lying in the slightly smaller interval
$\mathcal{H}'_n = [a_n', b_0]$ that can be decomposed into the subintervals

$$\mathcal{H}'_{\ell,j} := [h'_{\ell,j-1}, h'_{\ell,j}] \quad\text{with}\quad h'^{\,m}_{\ell,j} := 2^j a'^{\,m}_{n_\ell}. \qquad (4.1)$$

Note that it is straightforward to show that (3.4) remains valid if we replace $h_{\ell,j}$ by $h'_{\ell,j}$. In
particular, we still have $L(\ell) \le 2\log n_\ell$ where $L(\ell)$ is now defined as $L(\ell) := \max\{j : h'_{\ell,j} \le 2b_0\}$.
Recall that $n_\ell = 2^\ell$, $\ell \ge 0$, and set for $\ell \ge 1$

$$\gamma_\ell = n_\ell/\log n_\ell. \qquad (4.2)$$
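The dyadic bandwidth grid used in the blocking argument can be generated directly from its definition. The sketch below is only an illustration: `bandwidth_grid` is a hypothetical helper, and the lower limit is passed in as a function `a` of the sample size, so the same code covers $a_n$ of Theorem 1 and $a_n'$ of Theorem 2.

```python
import math

def bandwidth_grid(ell, m, b0, a):
    """Grid h_{l,j} with h_{l,j}^m = 2^j * a(n_l)^m for n_l = 2^l,
    stopped at L(l) = max{j : h_{l,j} <= 2*b0}. Returns (n_l, grid),
    where grid[j] = h_{l,j} and L(l) = len(grid) - 1."""
    n_l = 2 ** ell
    a_l = a(n_l)
    grid = [a_l]                              # h_{l,0} = a(n_l)
    j = 1
    while (2 ** j) ** (1.0 / m) * a_l <= 2 * b0:
        grid.append((2 ** j) ** (1.0 / m) * a_l)
        j += 1
    return n_l, grid
```

Consecutive grid points differ by the factor $2^{1/m} \le 2$, so the last point exceeds $b_0$ whenever the grid is stopped at $2b_0$, which is why the subintervals cover $[a(n_\ell), b_0]$.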

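For an arbitrary $\varepsilon > 0$ we shall decompose each function in $\mathcal{G}$ as

$$\bar{G}_{g,h,\mathbf{t}}(\mathbf{x},\mathbf{y}) = \bar{G}_{g,h,\mathbf{t}}(\mathbf{x},\mathbf{y})1\{\widetilde{F}(\mathbf{y}) \le \varepsilon\gamma_\ell^{1/p}\} + \bar{G}_{g,h,\mathbf{t}}(\mathbf{x},\mathbf{y})1\{\widetilde{F}(\mathbf{y}) > \varepsilon\gamma_\ell^{1/p}\}$$
$$=: \bar{G}^{(\ell)}_{g,h,\mathbf{t}}(\mathbf{x},\mathbf{y}) + \widetilde{G}^{(\ell)}_{g,h,\mathbf{t}}(\mathbf{x},\mathbf{y}),$$

where $\widetilde{F}(\mathbf{y})$ is the (symmetric) envelope function of the class $\mathcal{G}$ as defined in (2.6). Then
$u_n(g, h, \mathbf{t})$ can be decomposed as well for any $n_{\ell-1} < n \le n_\ell$, since from (2.2),

$$u_n(g,h,\mathbf{t}) = \sqrt{n}\{U_n^{(m)}(\bar{G}^{(\ell)}_{g,h,\mathbf{t}}) - \mathbb{E}U_n^{(m)}(\bar{G}^{(\ell)}_{g,h,\mathbf{t}})\} + \sqrt{n}\{U_n^{(m)}(\widetilde{G}^{(\ell)}_{g,h,\mathbf{t}}) - \mathbb{E}U_n^{(m)}(\widetilde{G}^{(\ell)}_{g,h,\mathbf{t}})\}$$
$$=: u_n^{(\ell)}(g,h,\mathbf{t}) + \widetilde{u}_n^{(\ell)}(g,h,\mathbf{t}).$$

The term $u_n^{(\ell)}(g,h,\mathbf{t})$ will be called the truncated part and $\widetilde{u}_n^{(\ell)}(g,h,\mathbf{t})$ the remainder part.
To prove Theorem 2 we shall apply the Hoeffding decomposition to the truncated part and
analyze each of the terms separately, while the remainder part can be treated directly using
simple arguments based on standard inequalities. Note for further use that

$$a'^{\,m}_{n_\ell} = c^m\gamma_\ell^{-(1-2/p)}, \quad \ell \ge 1. \qquad (4.3)$$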
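The truncation step splits every kernel value according to whether the envelope lies below or above the level $\varepsilon\gamma_\ell^{1/p}$. A minimal sketch of that bookkeeping, with hypothetical helper names `truncation_level` and `split_kernel` and with kernel and envelope values supplied as plain arrays, is:

```python
import math
import numpy as np

def truncation_level(ell, p, eps):
    """eps * gamma_l^(1/p) with gamma_l = n_l / log n_l and n_l = 2^l."""
    n_l = 2 ** ell
    return eps * (n_l / math.log(n_l)) ** (1.0 / p)

def split_kernel(values, envelope, level):
    """Return (truncated, remainder) parts of the kernel values, according
    to whether the envelope is <= level or > level."""
    values = np.asarray(values, dtype=float)
    keep = np.asarray(envelope, dtype=float) <= level
    return values * keep, values * ~keep
```

The two parts always add back to the original values, which is the pointwise identity behind the decomposition of $u_n$ into a truncated part and a remainder part.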
4.1 Truncated part

Note that from (2.3) we need to consider the terms of $\sum_{k=1}^{m} \binom{m}{k} U_n^{(k)}(\pi_k \bar{G}^{(\ell)}_{g,h,\mathbf{t}})$. We shall start
with the linear term in this decomposition. Following the same reasoning as in the previous
section, we can show that $\pi_1 \bar{G}^{(\ell)}_{g,h,\mathbf{t}}$ is a centered conditional expectation, and that the first term
of (2.3) can be written as an empirical process based upon the sample $(X_1, Y_1), \dots, (X_n, Y_n)$
and indexed by the class of functions



S[:=[s^l,{-,-):ge:F,heK^,teW-] 



where was defined in the beginning of this section, and where 



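To show that $\mathcal{S}'_\ell$ is a VC-class, introduce the class of functions of $(\mathbf{x}, \mathbf{y}) \in \mathbb{R}^m \times \mathbb{R}^m$,

$$\mathcal{C} = \{h^m \bar{G}_{g,h,\mathbf{t}}(\mathbf{x},\mathbf{y})1\{\widetilde{F}(\mathbf{y}) \le c\} : g \in \mathcal{F},\ 0 < h < 1,\ \mathbf{t} \in \mathbb{R}^m,\ c > 0\}.$$

Since both $\mathcal{G}$ as defined in (2.5) and the class of functions of $\mathbf{y} \in \mathbb{R}^m$ given by $\mathcal{I} = \{1\{\widetilde{F}(\mathbf{y}) \le
c\} : c > 0\}$ are of VC-type (and note that $\mathcal{I}$ has a bounded envelope function), we can apply
Lemma A.1 in Einmahl and Mason [1] to conclude that $\mathcal{C}$ is of VC-type as well. Therefore, so is
the class of functions $m\mathcal{C}^{(1)}$ on $\mathbb{R}^2$, where $\mathcal{C}^{(1)}$ consists of the $\pi_1$-projections of the functions in the
class $\mathcal{C}$. Thus we see that $\mathcal{S}'_\ell \subset m\mathcal{C}^{(1)}$ and hence $\mathcal{S}'_\ell$ is of VC-type with the same characteristics
as $m\mathcal{C}^{(1)}$. Now, to find an envelope function for $\mathcal{S}'_\ell$, set $\mathbf{t}_{-j} := (t_1, \dots, t_{j-1}, t_{j+1}, \dots, t_m) \in
\mathbb{R}^{m-1}$ and $\mathbf{Z}_j(u) := (Z_1, \dots, Z_{j-1}, u, Z_{j+1}, \dots, Z_m)$ for $u \in \mathbb{R}$ and $\mathbf{Z} \in \mathbb{R}^m$. We can then rewrite
the function $S^{(\ell)}_{g,h,\mathbf{t}}(x,y) \in \mathcal{S}'_\ell$ as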



X 



h 



E 



9{^i{y))K 



ti -X* 



l{F{YM)<ei]''} 



K 



X 



E 



h 

t2-X* 

h 



+ ...+ K 



E 



l{F{Y,{y)) < e^J'} 
-X* 



h 



1{F{YM) < eiy^} , (4.4) 



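where $\mathbf{X}^* = (X_2, \dots, X_m) \in \mathbb{R}^{m-1}$ and where (with abuse of notation here) the product kernel
in (K.iii) is now defined for $(m-1)$-dimensional vectors, i.e. $\widetilde{K}(\mathbf{u}) = \prod_{i=1}^{m-1} K(u_i)$, $\mathbf{u} \in \mathbb{R}^{m-1}$.
Hence, we can bound $S^{(\ell)}_{g,h,\mathbf{t}}(x,y)$ simply as

$$|S^{(\ell)}_{g,h,\mathbf{t}}(x,y)| \le \kappa^m\{\mathbb{E}[F(y,Y_2,\dots,Y_m)] + \mathbb{E}[F(Y_2,y,Y_3,\dots,Y_m)] + \dots + \mathbb{E}[F(Y_2,\dots,Y_m,y)]\} =: G_m(x,y).$$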
We shall now apply the moment bound in Theorem A.3 to the subclasses

$$\mathcal{S}'_{\ell,j} := \{S^{(\ell)}_{g,h,\mathbf{t}}(\cdot,\cdot) : g \in \mathcal{F},\ h \in \mathcal{H}'_{\ell,j},\ \mathbf{t} \in \mathbb{R}^m\}, \quad 1 \le j \le L(\ell),$$

where $\mathcal{H}'_{\ell,j}$ was defined in (4.1). Since $\mathcal{S}'_{\ell,j} \subset \mathcal{S}'_\ell$ for $j = 1, \dots, L(\ell)$, all these subclasses are
of VC-type with the same envelope function and characteristics as the class $m\mathcal{C}^{(1)}$ (which is
independent of $\ell$), verifying (ii) in the Theorem. For (i), recall that although all the terms of
the envelope function $G_m(x,y)$ are different, their expectation is the same. Therefore, denoting
$\mathbf{Y}^*$ for $(Y_2, \dots, Y_m)$ and applying Minkowski's inequality followed by Jensen's inequality, we
obtain from assumption (1.6) the following upper bound for the second moment of the envelope
function:

$$\mathbb{E}G_m^2(X,Y) = \kappa^{2m}\,\mathbb{E}_Y\{\mathbb{E}_{\mathbf{Y}^*}[F(Y,Y_2,\dots,Y_m)] + \mathbb{E}_{\mathbf{Y}^*}[F(Y_2,Y,Y_3,\dots,Y_m)] + \dots + \mathbb{E}_{\mathbf{Y}^*}[F(Y_2,\dots,Y_m,Y)]\}^2$$
$$\le m^2\kappa^{2m}\,\mathbb{E}F^2(Y_1,\dots,Y_m).$$

Note further that by symmetry of F, 

EG(1,(X, Y) = h-E[g{Y)K(^^^)l{F{Y) < .7^"}], 

such that Jensen's inequality, the change of variable u = (t — x)//i and the assumption in ( 11. 6p 
give the following upper bound for the second moment of any function in : 

HSl'Ux,Y)f < m'E[g\Y)k\^)l{FiY) <s^y^}_ 

< m^K^'^K^j E[F^{Y)\X = t-hM]fx{ti-hui)...fx{trr,-hum)d\i 



2 ' 2 1 



< m'K'^^^J^WfxWZK^. (4.5) 

Therefore, with (3 = mK^^y^{l V our previous calculations give us that 

EG^(X, Y) < and sup ES\X, Y) < ^'^h'^^ =: erf 

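verifying condition (iii) as well. Finally, recall from (2.6) that since $\mathcal{G}$ has envelope function
$\widetilde{F}(\mathbf{y})$, it holds for all $x, y \in \mathbb{R}$ that

$$|S^{(\ell)}_{g,h,\mathbf{t}}(x,y)| \le m\,\mathbb{E}[\widetilde{F}(\mathbf{Y})1\{\widetilde{F}(\mathbf{Y}) \le \varepsilon\gamma_\ell^{1/p}\} \mid (X_1,Y_1) = (x,y)] \le m\varepsilon\gamma_\ell^{1/p},$$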
such that by taking $\varepsilon > 0$ small enough, Theorem A.3 is now applicable, and gives us an
absolute constant $A_1 < \infty$ for which

$$\mathbb{E}\Big\|\sum_{i=1}^{n_\ell} \epsilon_i S(X_i,Y_i)\Big\|_{\mathcal{S}'_{\ell,j}} \le A_1\sqrt{n_\ell h'^{\,m}_{\ell,j}(|\log h'_{\ell,j}| \vee \log\log n_\ell)} =: A_1\lambda'_j(\ell), \qquad (4.6)$$

where $\epsilon_1, \dots, \epsilon_{n_\ell}$ are independent Rademacher variables, independent of $(X_i, Y_i)$, $1 \le i \le n_\ell$.
Consequently, applying the exponential inequality of Talagrand [9] to the class $\mathcal{S}'_{\ell,j}$ (see Theorem
A.5 in the Appendix) with $M = m\varepsilon\gamma_\ell^{1/p}$, $\sigma^2 = \beta^2 h'^{\,m}_{\ell,j}$ and the moment bound in (4.6), we get
for an absolute constant $A_2 < \infty$ and all $t > 0$ that



pj max \\V^an\\s'>Ci{Ai\'M)+t)\ 



< 2 



A2V \ / A2t 



(4.7) 



Towards applying this inequality with t = pX'j{£),p > 1, note that it clearly follows from (14.31) 
and the definitions of h'f j and X'j{i) that for all j > 0, 

x'He) , , , 

I log hi, j I V log log Tie > log log rii, 



2^c" logn^d logh'i A V log log n^) > c™(loglogn£) 



7/ 



Consequently, (14.71) when applied with t = pX'j{i) and any p > 1 with i large enough, yields for 
suitable constants A'2, A!^ and A3, the inequality 



p( max llv^aJI^; , >Ci(Ai+p)A'(£)j 



< 2 [exp (-A'ap^ log log ne) + exp (-Agp log log rii)] 

< 4(logn£)-^«''. (4.8) 



Keeping in mind that mh'^^/nUn\^^lG^g\^ is an empirical process an{S^g\^ indexed by the 
class iS^, and recalling (13.41) . we obtain for £ > 1 that, 



max A'^^ := max sup sup sup 



""n,£ • ------ ^v^i- ^^^^ _ — ^ — 

ni_i<n<ni ^l-l<r^<^l a'^<h<bo g(^J^ t&J^ a/ | log /l | V log log 
n



< max max sup sup sup — 

n,_i<n<n, i<j<m ae^teR^ Jneh'i^jil logh'^ .\ V loglogn^) 

3|v^a„(ff)| 

< max max sup — — — . 

ne_j_<n<ne l<j<L(£) jj^^s' . X Al) 

£.j J 

Consequently, recalling once again that L{i) < 2 log rig, we can infer from (14.81) that for some 
constant C^^p) > 3Ci{Ai + p), 

r 1 ''^'^ r 

P<^ max A'^,>C5ip)\ < Y.^ { max \\y/^a4s' > + p)\'Ai) 

J j=l 

< 8 (log n^) ^-^3". 

The Borel-Cantelli lemma when combined with this inequality for $p \ge (2 + \delta)/A_3$, $\delta > 0$, and
with the choice $n_\ell = 2^\ell$, establishes for some $C' < \infty$ and with probability one, that

$$\limsup_{\ell\to\infty}\ \max_{n_{\ell-1} < n \le n_\ell}\ \sup_{a_n' \le h \le b_0}\ \sup_{g \in \mathcal{F}}\ \sup_{\mathbf{t} \in \mathbb{R}^m} \frac{m\sqrt{nh^m}\,|U_n^{(1)}(\pi_1 \bar{G}^{(\ell)}_{g,h,\mathbf{t}})|}{\sqrt{|\log h| \vee \log\log n}} \le C', \qquad (4.9)$$

finishing the study of the first term in (2.3). We now show that all the other terms of (2.3)
are asymptotically bounded or go to zero at the proper rate, which will be obtained if we can
prove that for $k = 2, \dots, m$ and with probability one,



max sup sup sup — - = C(7^ ). (4-lU) 

ri<!-l<™<n^ a/^</i<fcQ ggjrteK"' A/|log/l| V log log n 



Analogously to the bounded case, we start by defining the classes of functions on x R'" and 



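Then it is easily verified that these classes are of VC-type with characteristics that are
independent of $\ell$, and with envelope functions $\widetilde{F}$ and $(2^k\varepsilon\gamma_\ell^{1/p})^{-1}F_k$ respectively. The function $\widetilde{F}$ is
defined as in (2.6) and $F_k$ is determined just as in the proof of Theorem 1 in Giné and Mason
[7]. Note that, in the same spirit as (3.5) and (3.7), by setting

$$\mathbb{U}'_n(j,k,\ell) := \sup_{H \in \mathcal{G}'^{(k)}_{\ell,j}} \Big| \frac{1}{n^{k/2}} \sum_{\mathbf{i} \in I_n^k} H(\mathbf{X}_\mathbf{i}, \mathbf{Y}_\mathbf{i}) \Big|, \quad n_{\ell-1} < n \le n_\ell,$$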
we have for all $k = 2, \dots, m$ and $n_{\ell-1} < n \le n_\ell$,

K{j,k,e)<n'/'\\Ui'\7rkG)\y.^. 

Consequently, applying Theorem A.1 with $c = 1/2$ and $r = 2$ gives us precisely (3.6) with
$\mathbb{U}_n(j,k,\ell)$ and $\mathbb{U}_{n_\ell}(j,k,\ell)$ replaced by $\mathbb{U}'_n(j,k,\ell)$ and $\mathbb{U}'_{n_\ell}(j,k,\ell)$ respectively. Therefore the
same methodology as in the bounded case will be applied. Note also that, as held for all the
functions in $\mathcal{G}^{(k)}_{\ell,j}$, the functions in $\mathcal{G}'^{(k)}_{\ell,j}$ are bounded by 1, and have second moments that can
be bounded by $h^m D_m$ for a suitable $D_m$ by arguing as in (4.5) and (3.8). Consequently, the
expression in (3.8) is satisfied for functions in $\mathcal{G}'^{(k)}_{\ell,j}$ as well, i.e.

$$\sup_{H \in \mathcal{G}'^{(k)}_{\ell,j}} \mathbb{E}H^2(\mathbf{X},\mathbf{Y}) \le D_m h'^{\,m}_{\ell,j} =: \sigma'^{\,2}_{\ell,j}.$$

Hence, all the conditions for Theorems A.4 and A.6 are satisfied, so that after some obvious
identifications and modifications, the second part of the proof of Theorem 1 (and (3.12) in
particular) gives us for all $j = 1, \dots, L(\ell)$ and any $p \ge 2$,

pj max W;(j,A;,£) > 2p'=/V*l < AA^Oog^^)"^'''", (4-11) 

with y'* = C[ fcA^. ;,(£), and where A^-^(£) is defined as in (l3.1Up with h^j replaced by h'^-. Now, 
to finish the proof of (I4.10p . note that similarly to (I3.13p . for some > 0, 



This gives that 



max^ ^n£A: max^ sup sup sup 



ni-i<n<ni "^-i<'^<"« aJ,</i<bo <?6j^ teK™ a/(| log /l| V log log n)^ 

< — , max max 



L/m^fc-l n,_i<n<n, l<j<L{E) A^(£) 



From (14. 3 p we see now that '^1^^ /a'^^n\ ^ = c "^n^ ''/logne. Therefore by choosing Cs,k > 
2''+^c-"'/^€CkC[j^{{2 + S)lCT,Kfl'^ and noting that h!^ - < 2 for all j = 1, . . . , L{i), we can infer 
from fHrTTD that 



P| max J^a;,,,,>C8,4 < v^(logn,)-(^+^) 

L ni_i<n<ni V ?^ J 

This implies immediately via Borel-Cantelli that for all k = 2, . . . ,m and i > 1, 

max sup sup sup — = = 

"f-i<n<n^ a^</i<bo ae.^^teM'" y(|log/i| V log log n)'^ 
which obviously implies (4.10). Finally, recalling the Hoeffding decomposition (2.3), this implies
together with (4.9) that with probability one,

hmsup max sup sup sup : — < C . (4.1/) 

£^oo "<-i<"-^"« a;j</i<6o geJ^tGR™ a/ I log /i I V log log n 

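4.2 Remainder part

Consider now the remainder process $\widetilde{u}_n^{(\ell)}(g, h, \mathbf{t})$ based on the unbounded (symmetric) U-kernel
given by

$$\widetilde{G}^{(\ell)}_{g,h,\mathbf{t}}(\mathbf{x},\mathbf{y}) := \bar{G}_{g,h,\mathbf{t}}(\mathbf{x},\mathbf{y})1\{\widetilde{F}(\mathbf{y}) > \varepsilon\gamma_\ell^{1/p}\},$$

where we defined $\gamma_\ell$ as in (4.2). We shall show that this U-process is asymptotically negligible
at the rate given in Theorem 2. More precisely, we shall prove that as $\ell \to \infty$,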

max sup sup sup : — = o(l), a.s. (4-lo) 



"'i-i<i^<ni a'„<h<bo geJ^teK™ a/| \ogh\ V loglogn 

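Recall that for all $g \in \mathcal{F}$, $h \in [a_n', b_0]$ and $\mathbf{t}, \mathbf{x} \in \mathbb{R}^m$, $\widetilde{F}(\mathbf{y}) \ge h^m|\bar{G}_{g,h,\mathbf{t}}(\mathbf{x},\mathbf{y})|$, so from the
symmetry of $\widetilde{F}$, it holds that

$$|U_n^{(m)}(\widetilde{G}^{(\ell)}_{g,h,\mathbf{t}})| \le h^{-m}\,U_n^{(m)}(\widetilde{F} \cdot 1\{\widetilde{F} > \varepsilon\gamma_\ell^{1/p}\}),$$

where $U_n^{(m)}(\widetilde{F} \cdot 1\{\widetilde{F} > \varepsilon\gamma_\ell^{1/p}\})$ is a U-statistic based on the positive and symmetric kernel
$\mathbf{y} \mapsto \widetilde{F}(\mathbf{y})1\{\widetilde{F}(\mathbf{y}) > \varepsilon\gamma_\ell^{1/p}\}$. Recalling that $a'^{\,m}_n = c^m(\log n/n)^{1-2/p}$, we obtain easily that for all
$g \in \mathcal{F}$, $h \in [a_n', b_0]$, $\mathbf{t} \in \mathbb{R}^m$ and some $C > 0$,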



v^l^^^(g^lt)l ^ ^eUt\F.l{F>e^y'}) 
max — ,. < 

n£_i<n<ni 



v/| log /i I V loglogn ^<7(|log<J Vloglogn^) 

< C^r^'Uif{F.l{F>e^y^}). 



Arguing in the same way, since a [/-statistic is an unbiased estimator of its kernel, we get that 
uniformly in g E J-', h & [a'„, bo] and t G M™, 

max JL_ < C-i^ ''^E[/^7^ {F ■ \{F > e-f/"}) 

ne_i<n<ne | log /?. | V log log 72 

< C'E[FP{Y)1{F{Y) > e^y^"}]. (4.14) 
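From (4.14) we see that as $\ell \to \infty$,

$$\max_{n_{\ell-1} < n \le n_\ell}\ \sup_{a_n' \le h \le b_0}\ \sup_{g \in \mathcal{F}}\ \sup_{\mathbf{t} \in \mathbb{R}^m} \frac{\sqrt{nh^m}\,|\mathbb{E}U_n^{(m)}(\widetilde{G}^{(\ell)}_{g,h,\mathbf{t}})|}{\sqrt{|\log h| \vee \log\log n}} = o(1). \qquad (4.15)$$

Thus to finish the proof of (4.13) it suffices to show that

$$U_{n_\ell}^{(m)}(\widetilde{F} \cdot 1\{\widetilde{F} > \varepsilon\gamma_\ell^{1/p}\}) = o(\gamma_\ell^{-(1-1/p)}), \quad \text{a.s.} \qquad (4.16)$$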

First note that from Chebyshev's inequality and a well-known inequality for the variance of a
U-statistic (see Theorem 5.2 of Hoeffding [8]) we get for any $\delta > 0$,

P 1 1 f/M . 1{F > e^y^}) - EfTl^) {F . 1{F > .7^^}) | > H"^'"^^^^ } 

< 5-%r/-Var(f;l-)(F.I{F>.7y^})) 

1-2/p ^ ^ 

< m6~ \ E[F^(Y)l{F(Y)>57y^}]. (4.17) 

(logn^)^ 

Next, in order to establish the finite convergence of the series of the above probabilities, we
split the indicator function $1\{\widetilde{F}(\mathbf{Y}) > \varepsilon\gamma_\ell^{1/p}\}$ into two distinct parts determined by whether
$\widetilde{F}(\mathbf{Y}) > n_\ell^{1/p}$ or $\varepsilon\gamma_\ell^{1/p} < \widetilde{F}(\mathbf{Y}) \le n_\ell^{1/p}$, and consider the corresponding second moments in
(4.17) separately. In the second case, note that from (1.6) and (2.6), $\mathbb{E}\widetilde{F}^p(\mathbf{Y}) \le \mu_p\kappa^{pm}(m!)^p$,
and observe that since $p > 2$ and $n_\ell = 2^\ell$,

00 l-2/p 00 

E (loZ )2-2/, nF'iY)l{FiY) > ny^}] < E[F^(Y)] J^ihgn^r^'-'M < 00. 

To handle the first case, we shall need the following fact from Einmahl and Mason [1].

Fact 1 Let $(c_n)_{n \ge 1}$ be a sequence of positive constants such that $c_n/n^{1/s} \nearrow \infty$ for $s > 0$, and
let $Z$ be a random variable satisfying $\sum_{n=1}^{\infty} P\{|Z| > c_n\} < \infty$. Then we have for any $q > s$,

$$\sum_{k=1}^{\infty} 2^k\,\mathbb{E}[|Z|^q 1\{|Z| \le c_{2^k}\}]/(c_{2^k})^q < \infty.$$

Setting c„ = n^/'' into Fact 1, we conclude from this inequality that for p < s < r < 2p, 

00 l-2/p 
^ ^ e'-^ n,E[F^(Y)I{F(Y) < n]'"}]



Finally, note that the bound leading to (14.141) implies that 

lVm^-\F.l{F>e^]''})=o{l). 

Consequently, the above results together with (4.17) imply via Borel-Cantelli and the arbitrary
choice of $\delta > 0$ that (4.16) holds, which when combined with (4.15) and (4.14) completes the
proof of (4.13). This also finishes the proof of Theorem 2 since we have already established the
result in (4.12).



5 Proof of Theorem 3: uniform consistency of $\widehat{\mathbb{E}}\widehat{m}_{n,\varphi}(\mathbf{t},h)$
to $m_\varphi(\mathbf{t})$

Theorem 3 is essentially a consequence of Theorem A.2 in the Appendix. Recall that a
$U$-statistic with $U$-kernel $H$ is an unbiased estimator of $\mathbb{E}H$. Writing $d\mathbf{x}$ and $d\mathbf{y}$ for $dx_1 dx_2 \cdots dx_m$
and $dy_1 dy_2 \cdots dy_m$ respectively, we see that

$$\mathbb{E}U_n(1,h,\mathbf{t}) = \int K_h(\mathbf{t}-\mathbf{x}) f(x_1,y_1)\cdots f(x_m,y_m)\,d\mathbf{x}\,d\mathbf{y} = \tilde f * K_h(\mathbf{t}),$$

where the function $\tilde f : \mathbb{R}^m \to \mathbb{R}$ is defined in (1.9). Since we assume $f_X$ to be continuous on
$J = I^{\eta}$, the function $\tilde f$ is continuous on $J^m = J \times \ldots \times J$. Therefore we can infer from Theorem
A.2 that

$$\sup_{0<h<b_n} \sup_{\mathbf{t}\in I^m} |\mathbb{E}U_n(1,h,\mathbf{t}) - \tilde f(\mathbf{t})| \longrightarrow 0, \qquad (5.1)$$

for all sequences of positive constants $b_n \to 0$, and where $I^m = I \times \ldots \times I$. In the same way,
notice that



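$$\mathbb{E}U_n(\varphi,h,\mathbf{t}) = \int \varphi(\mathbf{y}) K_h(\mathbf{t}-\mathbf{x}) f(x_1,y_1)\cdots f(x_m,y_m)\,d\mathbf{x}\,d\mathbf{y} = \big\{\mathbb{E}[\varphi(\mathbf{Y}) \mid \mathbf{X} = \cdot\,]\,\tilde f(\cdot)\big\} * K_h(\mathbf{t}).$$

Hence, Theorem A.2 applied to the class of functions $\mathcal{M}$ as defined in (1.8) gives that

$$\sup_{0<h<b_n} \sup_{\varphi\in\mathcal{F}} \sup_{\mathbf{t}\in I^m} |\mathbb{E}U_n(\varphi,h,\mathbf{t}) - m_\varphi(\mathbf{t})\tilde f(\mathbf{t})| \longrightarrow 0. \qquad (5.2)$$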

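Keeping in mind the definition of $\widehat{\mathbb{E}}\widehat{m}_{n,\varphi}(\mathbf{t},h)$ in (1.7), it is clear that since $f_X$ is bounded away
from zero on $J$, (5.1) and (5.2) imply that

$$\sup_{0<h<b_n} \sup_{\varphi\in\mathcal{F}} \sup_{\mathbf{t}\in I^m} |\widehat{\mathbb{E}}\widehat{m}_{n,\varphi}(\mathbf{t},h) - m_\varphi(\mathbf{t})| = o(1),$$

finishing the proof of Theorem 3.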
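
As a purely numerical illustration of the kind of uniform convergence used in this proof (a continuous function convolved with a rescaled kernel approaches the function itself as the bandwidth shrinks), the following Python sketch evaluates the sup-distance on a grid. The test function `np.sin`, the uniform kernel on $[-1/2, 1/2]$, the interval $[0, 1]$ and the helper names `smoothed` and `sup_error` are assumptions made for this example only; none of them comes from the paper.

```python
import numpy as np


def smoothed(phi, z, h, grid_size=2001):
    """Approximate (phi * K_h)(z) for the uniform kernel K = 1 on [-1/2, 1/2].

    Since K_h(u) = K(u / h) / h, the convolution equals the integral of
    phi(z - h * u) over u in [-1/2, 1/2]; a midpoint rule is used here.
    """
    u = (np.arange(grid_size) + 0.5) / grid_size - 0.5  # midpoints of [-1/2, 1/2]
    return np.mean(phi(z[:, None] - h * u[None, :]), axis=1)


def sup_error(phi, h, a=0.0, b=1.0, n_points=401):
    """Grid approximation of sup over z in [a, b] of |phi * K_h(z) - phi(z)|."""
    z = np.linspace(a, b, n_points)
    return float(np.max(np.abs(smoothed(phi, z, h) - phi(z))))


# Hypothetical usage: the sup-error should shrink as the bandwidth h decreases.
for h in (0.5, 0.1, 0.01):
    print(h, sup_error(np.sin, h))
```

For this particular choice the convolution has a closed form, so the error at a point $z$ is $|\sin z|\,(1 - \sin(h/2)/(h/2))$, which is of order $h^2$; the sketch says nothing about rates in the general setting of the theorem.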
6 Proof of Theorem 4: convergence rates of the conditional
$U$-statistic $\widehat{m}_{n,\varphi}(\mathbf{t},h)$

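Observe that

$$|\widehat{m}_{n,\varphi}(\mathbf{t},h) - \widehat{\mathbb{E}}\widehat{m}_{n,\varphi}(\mathbf{t},h)| \le \frac{|U_n(\varphi,h,\mathbf{t}) - \mathbb{E}U_n(\varphi,h,\mathbf{t})|}{|U_n(1,h,\mathbf{t})|} + \frac{|\mathbb{E}U_n(\varphi,h,\mathbf{t})| \cdot |U_n(1,h,\mathbf{t}) - \mathbb{E}U_n(1,h,\mathbf{t})|}{|U_n(1,h,\mathbf{t})| \cdot |\mathbb{E}U_n(1,h,\mathbf{t})|} =: (\mathrm{I}) + (\mathrm{II}).$$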
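
The split into (I) and (II) rests on an elementary identity for a difference of two ratios. A short derivation might read as follows; the shorthand $a, b, c, d$ is introduced here only for illustration and is not the paper's notation.

```latex
% Shorthand (assumption, for illustration only):
%   a = U_n(\varphi,h,t),  b = U_n(1,h,t),
%   c = E U_n(\varphi,h,t), d = E U_n(1,h,t),  with b, d \neq 0.
\[
  \frac{a}{b} - \frac{c}{d}
  = \frac{a-c}{b} + c\Bigl(\frac{1}{b} - \frac{1}{d}\Bigr)
  = \frac{a-c}{b} + \frac{c\,(d-b)}{b\,d},
\]
% and the triangle inequality then gives the two terms (I) and (II):
\[
  \Bigl|\frac{a}{b} - \frac{c}{d}\Bigr|
  \le \frac{|a-c|}{|b|} + \frac{|c|\,|b-d|}{|b|\,|d|}.
\]
```

The first term compares the numerators, while the second term carries the fluctuation of the denominator weighted by the centering $|c|$, which is why bounds on $|b|$ and $|d|$ away from zero and a bound on $|c|$ are needed.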

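From Theorem 1, (5.1) and $f_X$ bounded away from zero on $J$ we get for some $\xi_1, \xi_2 > 0$ and $c$
large enough in $a_n = c(\log n/n)^{1/m}$,

$$\liminf_{n\to\infty} \sup_{a_n \le h \le b_n} \sup_{\mathbf{t}\in I^m} |U_n(1,h,\mathbf{t})| = \xi_1 > 0, \quad \text{a.s.},$$

and for $n$ large enough,

$$\sup_{a_n \le h \le b_n} \sup_{\mathbf{t}\in I^m} |\mathbb{E}U_n(1,h,\mathbf{t})| = \xi_2 > 0.$$

Further, for $a_n''$ either $a_n$ or $a_n'$, we obtain readily from the assumptions (1.5) or (1.6) on the
envelope function that

$$\sup_{a_n'' \le h \le b_n} \sup_{\varphi\in\mathcal{F}} \sup_{\mathbf{t}\in I^m} |\mathbb{E}U_n(\varphi,h,\mathbf{t})| = O(1).$$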

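Hence, we can now use Theorem 1 to handle (II), while for (I), depending on whether the class $\mathcal{F}$
satisfies (1.5) or (1.6), we apply Theorem 1 or Theorem 2 respectively. Taking everything
together we conclude that for $c$ large enough and some $C'' > 0$, with probability one,

$$\limsup_{n\to\infty} \sup_{a_n'' \le h \le b_n} \sup_{\varphi\in\mathcal{F}} \sup_{\mathbf{t}\in I^m} \frac{\sqrt{nh^m}\,|\widehat{m}_{n,\varphi}(\mathbf{t},h) - \widehat{\mathbb{E}}\widehat{m}_{n,\varphi}(\mathbf{t},h)|}{\sqrt{|\log h| \vee \log\log n}}$$

$$\le \limsup_{n\to\infty} \sup_{a_n'' \le h \le b_n} \sup_{\varphi\in\mathcal{F}} \sup_{\mathbf{t}\in I^m} \frac{\sqrt{nh^m}\,(\mathrm{I})}{\sqrt{|\log h| \vee \log\log n}} + \limsup_{n\to\infty} \sup_{a_n'' \le h \le b_n} \sup_{\varphi\in\mathcal{F}} \sup_{\mathbf{t}\in I^m} \frac{\sqrt{nh^m}\,(\mathrm{II})}{\sqrt{|\log h| \vee \log\log n}} \le C'',$$

proving the assertion of Theorem 4.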
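
To make the object of Theorems 3 and 4 concrete, here is a minimal Python sketch of the ratio $U_n(\varphi,h,\mathbf{t})/U_n(1,h,\mathbf{t})$ that the proof above works with, computed by brute force over all tuples of distinct indices. The function names, the uniform product kernel on $[-1/2,1/2]^m$ and the toy design in the usage lines are assumptions chosen for this example; the normalising factors common to numerator and denominator are omitted because they cancel in the ratio.

```python
import itertools

import numpy as np


def uniform_kernel(u):
    """Kernel supported on [-1/2, 1/2] that integrates to one (an assumption)."""
    return np.where(np.abs(u) <= 0.5, 1.0, 0.0)


def conditional_u_statistic(x, y, phi, t, h, kernel=uniform_kernel):
    """Ratio estimator U_n(phi, h, t) / U_n(1, h, t) for scalar X_i, Y_i.

    The sums run over all m-tuples of pairwise distinct indices, with
    product-kernel weights K((t_1 - X_{i_1}) / h) * ... * K((t_m - X_{i_m}) / h).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    t = np.asarray(t, dtype=float)
    m, n = len(t), len(x)
    # weights[j, i] = K((t_j - X_i) / h)
    weights = kernel((t[:, None] - x[None, :]) / h)
    num = 0.0
    den = 0.0
    for idx in itertools.permutations(range(n), m):
        w = 1.0
        for j, i in enumerate(idx):
            w *= weights[j, i]
            if w == 0.0:
                break
        if w == 0.0:
            continue  # tuple falls outside the kernel window
        num += w * phi(*(y[i] for i in idx))
        den += w
    if den == 0.0:
        return float("nan")  # no tuple close enough to t at this bandwidth
    return float(num / den)


# Hypothetical usage with m = 2 on a noise-free toy design Y_i = X_i.
x = np.linspace(0.0, 1.0, 101)
print(conditional_u_statistic(x, x, lambda a, b: a * b, (0.3, 0.7), h=0.1))
```

The brute-force loop visits on the order of $n^m$ tuples, so it is meant only for small illustrations; it returns `nan` when no tuple has positive weight, which is the situation the lower bound on $|U_n(1,h,\mathbf{t})|$ in the proof rules out asymptotically.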
A Appendix



The first result below is stated as Theorem 4 in Giné and Mason [7], and is essentially a
consequence of a martingale inequality due to Brown. The second theorem is a generalization
of Bochner's lemma.

Theorem A.1 (Theorem 4 of Giné and Mason, 2007b) Let $X_1, X_2, \ldots$ be i.i.d. $S$-valued
with probability law $P$. Let $\mathcal{H}$ be a $P$-separable collection of measurable functions $f : S^k \to \mathbb{R}$
and assume that $\mathcal{H}$ is $P$-canonical (which means that every $f$ in $\mathcal{H}$ is $P$-canonical). Further
assume that $\mathbb{E}\|f(X_1, \ldots, X_k)\|_{\mathcal{H}}^r < \infty$ for some $r > 1$, and let $s$ be the conjugate of $r$. Then,
with $S_n$ defined as

Sn = sup f{Xi^, . . . 



77. 



n > k, 



we have for all $x > 0$ and $0 < c < 1$,

P<^ max Sr„ > x[ < ^ f K nJ 



~ m - - I / -, \ 

k<m<n ) X 1 — C 



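Theorem A.2 Let $I = [a,b]$ be a compact interval. Suppose that $\mathcal{H}$ is a uniformly equicontinuous
family of real valued functions $\varphi$ on $J = [a-\eta, b+\eta]^d$ for some $d \ge 1$ and $\eta > 0$. Further
assume that $K$ is an $L_1$-kernel with support in $[-1/2, 1/2]^d$ satisfying $\int_{\mathbb{R}^d} K(\mathbf{u})\,d\mathbf{u} = 1$. Then
uniformly in $\varphi \in \mathcal{H}$ and for any sequence of positive constants $b_n \to 0$,

$$\sup_{0<h<b_n} \sup_{\mathbf{z}\in I^d} |\varphi * K_h(\mathbf{z}) - \varphi(\mathbf{z})| \longrightarrow 0, \quad \text{as } n \to \infty,$$

where $K_h(\mathbf{z}) = h^{-d}K(\mathbf{z}/h)$ and $\varphi * K_h(\mathbf{z}) = \int_{\mathbb{R}^d} \varphi(\mathbf{x}) K_h(\mathbf{z}-\mathbf{x})\,d\mathbf{x}$.
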
A.1 Moment bounds

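Theorem A.3 (Proposition 1 of Einmahl and Mason, 2005) Let $\mathcal{G}$ be a pointwise measurable
class of bounded functions with envelope function $G$ such that for some constants
$C, \nu \ge 1$ and $0 < \sigma \le \beta$, the following conditions hold:

(i) $\mathbb{E}G^2(X) \le \beta^2$;

(ii) $\mathcal{N}(\varepsilon, \mathcal{G}) \le C\varepsilon^{-\nu}$, $0 < \varepsilon < 1$;

(iii) $\sigma_0^2 := \sup_{g\in\mathcal{G}} \mathbb{E}g^2(X) \le \sigma^2$;

(iv) $\sup_{g\in\mathcal{G}} \|g\|_\infty \le \frac{1}{4\sqrt{\nu}}\sqrt{n\sigma^2/\log(C_1\beta/\sigma)}$, where $C_1 = C^{1/\nu} \vee e$.

Then we have for some absolute constant $A$,

$$\mathbb{E}\Big\|\sum_{i=1}^{n} \varepsilon_i g(X_i)\Big\|_{\mathcal{G}} \le A\sqrt{\nu n \sigma^2 \log(C_1\beta/\sigma)},$$

where $\varepsilon_1, \ldots, \varepsilon_n$ are i.i.d. Rademacher variables independent of $X_1, \ldots, X_n$.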

Theorem A.4 (Corollary 1 of Giné and Mason, 2007b) Let $\mathcal{F}$ be a collection of measurable
functions $f : S^m \to \mathbb{R}$, symmetric in their entries with absolute values bounded by
$M > 0$, and let $P$ be any probability measure on $(S, \mathcal{S})$ (with $X_i$ i.i.d. $P$). Assume that $\mathcal{F}$ is
of VC-type with envelope function $F \equiv M$ and with characteristics $A$ and $v$. Then for every
$m \in \mathbb{N}$, $A \ge e^m$, $v \ge 1$ there exist constants $C_1 := C_1(m, A, v, M)$ and $C_2 = C_2(m, A, v, M)$
such that for $k = 1, \ldots, m$,

n'E\\Ui'\n,f)r^<C',2'a' (^log ^ 
assuming $n\sigma^2 \ge C_2 \log(A/\sigma)$, where $\sigma^2$ is any number satisfying




A.2 Exponential inequalities

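Theorem A.5 (Talagrand, 1994) Let $\mathcal{G}$ be a pointwise measurable class of functions satisfying

$$\|g\|_\infty \le M < \infty, \quad g \in \mathcal{G}.$$

Then we have for all $t > 0$,

$$\mathbb{P}\Big\{\max_{1\le m\le n} \|\sqrt{m}\,\alpha_m\|_{\mathcal{G}} \ge A_1\Big(\mathbb{E}\Big\|\sum_{i=1}^{n} \varepsilon_i g(X_i)\Big\|_{\mathcal{G}} + t\Big)\Big\} \le 2\Big[\exp\Big(-\frac{A_2 t^2}{n\sigma_{\mathcal{G}}^2}\Big) + \exp\Big(-\frac{A_2 t}{M}\Big)\Big],$$

where $\sigma_{\mathcal{G}}^2 = \sup_{g\in\mathcal{G}} \mathrm{Var}(g(X))$ and $A_1, A_2$ are universal constants.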

We now state the exponential inequality that will permit us to control the probability term
in (3.6), and which is stated as Theorem 5.3.14 in de la Peña and Giné (1999).

Theorem A.6 (Theorem 5.3.14 of de la Peña and Giné, 1999) Let $\mathcal{H}$ be a VC-subgraph
class of uniformly bounded measurable real valued kernels $H$ on $(S^m, \mathcal{S}^m)$, symmetric in their
entries. Then for each $1 \le k \le m$ there exist constants $c_k, d_k \in\, ]0, \infty[$ such that, for all $n \ge m$
and $t > 0$,

$$\mathbb{P}\big\{\|n^{k/2} U_n^{(k)}(\pi_k H)\|_{\mathcal{H}} > t\big\} \le c_k \exp\{-d_k t^{2/k}\}.$$